369 research outputs found

    Safe and complete contig assembly via omnitigs

    Contig assembly is the first stage that most assemblers solve when reconstructing a genome from a set of reads. Its output consists of contigs -- a set of strings that are promised to appear in any genome that could have generated the reads. Since the introduction of contigs 20 years ago, assemblers have tried to obtain longer and longer contigs, but the following question was never solved: given a genome graph G (e.g. a de Bruijn graph or a string graph), what are all the strings that can be safely reported from G as contigs? In this paper we finally answer this question, and also give a polynomial-time algorithm to find them. Our experiments show that these strings, which we call omnitigs, are 66% to 82% longer on average than the popular unitigs, and 29% of dbSNP locations have more neighbors in omnitigs than in unitigs. Comment: Full version of the paper in the proceedings of RECOMB 201
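
To make the baseline in this abstract concrete, the following is a minimal sketch of unitig extraction from an edge-centric de Bruijn graph. It does not implement the omnitig algorithm from the paper. It only illustrates the simpler unitigs that omnitigs are compared against. The function name `unitigs`, the toy reads, and the choice to ignore reverse complements are all assumptions made for illustration.

```python
def unitigs(reads, k):
    """Return the unitigs (maximal non-branching paths) of the de Bruijn
    graph whose edges are the distinct k-mers of `reads`.

    Illustrative sketch only: reverse complements and sequencing errors
    are ignored, which real assemblers cannot afford to do.
    """
    # Each distinct k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix.
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])

    out_edges = {}  # node -> list of outgoing k-mers
    in_deg = {}     # node -> number of incoming k-mers
    for kmer in sorted(kmers):
        out_edges.setdefault(kmer[:-1], []).append(kmer)
        in_deg[kmer[1:]] = in_deg.get(kmer[1:], 0) + 1
    nodes = set(out_edges) | set(in_deg)

    def is_internal(node):
        # A node can sit inside a unitig only if it has exactly one
        # incoming and exactly one outgoing edge.
        return in_deg.get(node, 0) == 1 and len(out_edges.get(node, [])) == 1

    result, used = [], set()
    # Start a path at every edge that leaves a branching or terminal node
    # and extend it while the next node is internal.
    for node in sorted(nodes):
        if is_internal(node):
            continue
        for kmer in out_edges.get(node, []):
            path, nxt = kmer, kmer[1:]
            used.add(kmer)
            while is_internal(nxt):
                edge = out_edges[nxt][0]
                used.add(edge)
                path += edge[-1]
                nxt = edge[1:]
            result.append(path)

    # Any k-mer still unused lies on an isolated cycle of internal nodes.
    for kmer in sorted(kmers):
        if kmer in used:
            continue
        path, nxt = kmer, kmer[1:]
        used.add(kmer)
        while True:
            edge = out_edges[nxt][0]
            if edge in used:
                break
            used.add(edge)
            path += edge[-1]
            nxt = edge[1:]
        result.append(path)
    return sorted(result)


if __name__ == "__main__":
    # Toy reads that share the prefix ACGT and then diverge.
    print(unitigs(["ACGTC", "CGTA"], 3))
```

In the toy example the graph branches at node GT, so the shared prefix is reported as one unitig and each divergent k-mer as its own. According to the abstract, omnitigs can safely extend past such branches in some cases, which is why they are longer on average.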

    Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes

    This study was funded by the Sars Centre core budget to M. Adamska. Sequencing was performed at the Norwegian High Throughput Sequencing Centre funded by the Norwegian Research Council. O.M.R. and D.E.K.F. acknowledge support from the BBSRC and the School of Biology, University of St Andrews.
    Sponges are simple animals with few cell types, but their genomes paradoxically contain a wide variety of developmental transcription factors [1,2,3,4], including homeobox genes belonging to the Antennapedia (ANTP) class [5,6], which in bilaterians encompass Hox, ParaHox and NK genes. In the genome of the demosponge Amphimedon queenslandica, no Hox or ParaHox genes are present, but NK genes are linked in a tight cluster similar to the NK clusters of bilaterians [5]. It has been proposed that Hox and ParaHox genes originated from NK cluster genes after divergence of sponges from the lineage leading to cnidarians and bilaterians [5,7]. On the other hand, synteny analysis lends support to the notion that the absence of Hox and ParaHox genes in Amphimedon is a result of secondary loss (the ghost locus hypothesis) [8]. Here we analysed complete suites of ANTP-class homeoboxes in two calcareous sponges, Sycon ciliatum and Leucosolenia complicata. Our phylogenetic analyses demonstrate that these calcisponges possess orthologues of bilaterian NK genes (Hex, Hmx and Msx), a varying number of additional NK genes and one ParaHox gene, Cdx. Despite the generation of scaffolds spanning multiple genes, we find no evidence of clustering of Sycon NK genes. All Sycon ANTP-class genes are developmentally expressed, with patterns suggesting their involvement in cell type specification in embryos and adults, metamorphosis and body plan patterning. 
These results demonstrate that ParaHox genes predate the origin of sponges, thus confirming the ghost locus hypothesis [8], and highlight the need to analyse the genomes of multiple sponge lineages to obtain a complete picture of the ancestral composition of the first animal genome.

    The post-vaccine microevolution of invasive Streptococcus pneumoniae

    The 7-valent pneumococcal conjugate vaccine (PCV7) has affected the genetic population of Streptococcus pneumoniae in pediatric carriage. Little is known, however, about pneumococcal population genomics in adult invasive pneumococcal disease (IPD) under vaccine pressure. We sequenced and serotyped 349 strains of S. pneumoniae isolated from IPD patients in Nijmegen between 2001 and 2011. Introduction of PCV7 in the Dutch National Immunization Program in 2006 preceded substantial alterations in the IPD population structure caused by serotype replacement. No evidence could be found for vaccine-induced capsular switches. We observed that after a temporary bottleneck in gene diversity following the introduction of PCV7, the accessory gene pool re-expanded, mainly through genes already circulating pre-PCV7. In the post-vaccine genomic population a number of genes changed frequency: certain genes became overrepresented in vaccine serotypes, while others shifted towards non-vaccine serotypes. Whether these dynamics in the invasive pneumococcal population have truly contributed to invasiveness and manifestations of disease remains to be further elucidated. We suggest the use of whole genome sequencing for surveillance of pneumococcal population dynamics that could give a prospect on the course of disease, facilitating effective prevention and management of IPD.

    Kinetoplastid Phylogenomics Reveals the Evolutionary Innovations Associated with the Origins of Parasitism

    The evolution of parasitism is a recurrent event in the history of life and a core problem in evolutionary biology. Trypanosomatids are important parasites and include the human pathogens Trypanosoma brucei, Trypanosoma cruzi, and Leishmania spp., which in humans cause African trypanosomiasis, Chagas disease, and leishmaniasis, respectively. Genome comparison between trypanosomatids reveals that these parasites have evolved specialized cell-surface protein families, overlaid on a well-conserved cell template. Understanding how these features evolved and which ones are specifically associated with parasitism requires comparison with related non-parasites. We have produced genome sequences for Bodo saltans, the closest known non-parasitic relative of trypanosomatids, and a second bodonid, Trypanoplasma borreli. Here we show how genomic reduction and innovation contributed to the character of trypanosomatid genomes. We show that gene loss has “streamlined” trypanosomatid genomes, particularly with respect to macromolecular degradation and ion transport, but consistent with a widespread loss of functional redundancy, while adaptive radiations of gene families involved in membrane function provide the principal innovations in trypanosomatid evolution. Gene gain and loss continued during trypanosomatid diversification, resulting in the asymmetric assortment of ancestral characters such as peptidases between Trypanosoma and Leishmania, genomic differences that were subsequently amplified by lineage-specific innovations after divergence. Finally, we show how species-specific, cell-surface gene families (DGF-1 and PSA) with no apparent structural similarity are independent derivations of a common ancestral form, which we call “bodonin.” This new evidence defines the parasitic innovations of trypanosomatid genomes, revealing how a free-living phagotroph became adapted to exploiting hostile host environments

    Horizontal DNA transfer mechanisms of bacteria as weapons of intragenomic conflict

    Horizontal DNA transfer (HDT) is a pervasive mechanism of diversification in many microbial species, but its primary evolutionary role remains controversial. Much recent research has emphasised the adaptive benefit of acquiring novel DNA, but here we argue instead that intragenomic conflict provides a coherent framework for understanding the evolutionary origins of HDT. To test this hypothesis, we developed a mathematical model of a clonally descended bacterial population undergoing HDT through transmission of mobile genetic elements (MGEs) and genetic transformation. Including the known bias of transformation toward the acquisition of shorter alleles into the model suggested it could be an effective means of counteracting the spread of MGEs. Both constitutive and transient competence for transformation were found to provide an effective defence against parasitic MGEs; transient competence could also be effective at permitting the selective spread of MGEs conferring a benefit on their host bacterium. The coordination of transient competence with cell-cell killing, observed in multiple species, was found to result in synergistic blocking of MGE transmission through releasing genomic DNA for homologous recombination while simultaneously reducing horizontal MGE spread by lowering the local cell density. To evaluate the feasibility of the functions suggested by the modelling analysis, we analysed genomic data from longitudinal sampling of individuals carrying Streptococcus pneumoniae. This revealed the frequent within-host coexistence of clonally descended cells that differed in their MGE infection status, a necessary condition for the proposed mechanism to operate. Additionally, we found multiple examples of MGEs inhibiting transformation through integrative disruption of genes encoding the competence machinery across many species, providing evidence of an ongoing "arms race." 
Reduced rates of transformation have also been observed in cells infected by MGEs that reduce the concentration of extracellular DNA through secretion of DNases. Simulations predicted that either mechanism of limiting transformation would benefit individual MGEs, but also that this tactic's effectiveness was limited by competition with other MGEs coinfecting the same cell. A further observed behaviour we hypothesised to reduce elimination by transformation was MGE activation when cells become competent. Our model predicted that this response was effective at counteracting transformation independently of competing MGEs. Therefore, this framework is able to explain both common properties of MGEs, and the seemingly paradoxical bacterial behaviours of transformation and cell-cell killing within clonally related populations, as the consequences of intragenomic conflict between self-replicating chromosomes and parasitic MGEs. The antagonistic nature of the different mechanisms of HDT over short timescales means their contribution to bacterial evolution is likely to be substantially greater than previously appreciated

    Deciphering the genome repertoire of Pseudomonas sp. M1 toward β-Myrcene biotransformation

    Pseudomonas sp. M1 is able to mineralize several unusual substrates of natural and xenobiotic origin, contributing to its competence to thrive in different ecological niches. In this work, the genome of the M1 strain was resequenced by Illumina MiSeq to refine the quality of a published draft by resolving the majority of repeat-rich regions. In silico genome analysis led to the prediction of metabolic pathways involved in biotransformation of several unusual substrates (e.g., plant-derived volatiles), providing clues on the genomic complement required for such biodegrading/biotransformation functionalities. Pseudomonas sp. M1 exhibits a particular sensory and biotransformation/biocatalysis potential toward β-myrcene, a terpene widely used in industries worldwide. Therefore, the genomic responsiveness of the M1 strain toward β-myrcene was investigated using an RNA sequencing approach. M1 cells challenged with β-myrcene (compared with cells grown in lactate) undergo an extensive alteration of the transcriptome expression profile, including 1,873 genes showing at least a 1.5-fold change in expression (627 upregulated and 1,246 downregulated), toward β-myrcene-imposed molecular adaptation and cellular specialization. A thorough data analysis identified a novel 28-kb genomic island, whose expression was strongly stimulated in β-myrcene-supplemented medium, that is essential for β-myrcene catabolism. This island includes β-myrcene-induced genes whose products are putatively involved in 1) substrate sensing, 2) gene expression regulation, and 3) β-myrcene oxidation and bioconversion of β-myrcene derivatives into central metabolism intermediates. 
In general, this locus does not show high homology with sequences available in databases and seems to have evolved through the assembly of several functional blocks acquired from different bacteria, probably at different evolutionary stages.
Acknowledgments: This work was supported by FEDER through POFC-COMPETE and by national funds from Foundation for Science and Technology (Portugal) through the projects PEst-C/BIA/UI4050/2011, PTDC/EBB-BIO/104980/2008 and PTDC/BIA-MIC/113733/09, and through a PhD grant (grant number SFRH/BD/76894/2011) to P.S.-C.

    An efficient approach to BAC based assembly of complex genomes

    Background: There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data, which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate ‘gold’ reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, with variable success. Results: We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. Conclusions: We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes.

    Bacterial endosymbiont Cardinium cSfur genome sequence provides insights for understanding the symbiotic relationship in Sogatella furcifera host

    Background: Sogatella furcifera is a migratory pest that damages rice plants and causes severe economic losses. Due to its ability to annually migrate long distances, S. furcifera has emerged as a major pest of rice in several Asian countries. Symbiotic relationships of inherited bacteria with terrestrial arthropods have significant implications. The genus Cardinium is present in many types of arthropods, where it influences some host characteristics. We present a report of a newly identified strain of the bacterial endosymbiont Cardinium cSfur in S. furcifera. Result: From the whole genome of S. furcifera previously sequenced by our laboratory, we assembled the whole genome sequence of Cardinium cSfur. The sequence comprised 1,103,593 bp with a GC content of 39.2%. The phylogenetic tree of the Bacteroides phylum to which Cardinium cSfur belongs suggests that Cardinium cSfur is closely related to the other strains (Cardinium cBtQ1 and cEper1) that are members of the Amoebophilaceae family. Genome comparison between host-dependent endosymbionts, including Cardinium cSfur, and free-living bacteria revealed that the endosymbiont has a smaller genome size and lower GC content, and has lost some genes related to metabolism because of its special environment, which is similar to the genome pattern observed in other insect symbionts. Cardinium cSfur has limited metabolic capability, which makes it less contributive to metabolic and biosynthetic processes in its host. From our findings, we inferred that, to compensate for its limited metabolic capability, Cardinium cSfur harbors a relatively high proportion of transport proteins, which might act as the hub between it and its host. With its acquisition of the whole operon related to biotin synthesis and of glycolysis-related genes through an HGT event, Cardinium cSfur seems to be undergoing changes while establishing a symbiotic relationship with its host. 
Conclusion: A novel bacterial endosymbiont strain (Cardinium cSfur) has been discovered. A genomic analysis of the endosymbiont in S. furcifera suggests that its genome has undergone certain changes to facilitate its settlement in the host. The envisaged potential of the new endosymbiont strain to manipulate reproduction in its S. furcifera host has vital implications in designing eco-friendly approaches to combat the insect pest.
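
The genome statistics quoted in this abstract (length in bp and GC content) are simple to compute from an assembled sequence. The sketch below shows one common way to do so. It is not the authors' pipeline. The function name `gc_content` and the decision to exclude ambiguous bases such as N from the denominator are assumptions made for illustration.

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Assumption for this sketch: ambiguous symbols such as N are ignored
    rather than counted in the denominator.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Toy sequence, not taken from the Cardinium cSfur assembly.
    print(gc_content("GGCCAATTAA"))
```

As a rough consistency check on the reported figures, 39.2% of 1,103,593 bp corresponds to about 433,000 G or C bases, bearing in mind that the percentage is rounded.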